The Sound of APALM Clapping: Faster Nonsmooth Nonconvex Optimization with Stochastic Asynchronous PALM
Damek Davis, Brent Edmunds, Madeleine Udell
We introduce the Stochastic Asynchronous Proximal Alternating Linearized Minimization (SAPALM) method, a block coordinate stochastic proximal-gradient method for solving nonconvex, nonsmooth optimization problems. SAPALM is the first asynchronous parallel optimization method that provably converges on a large class of nonconvex, nonsmooth problems. We prove that SAPALM matches the best known rates of convergence --- among synchronous or asynchronous methods --- on this problem class. We provide upper bounds on the number of workers for which we can expect to see a linear speedup, which match the best bounds known for less complex problems, and show that in practice SAPALM achieves this linear speedup. We demonstrate state-of-the-art performance on several matrix factorization problems.
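To make the block coordinate stochastic proximal-gradient update concrete, here is a minimal single-process sketch of a SAPALM-style step on a toy l1-regularized least-squares problem: a randomly chosen coordinate block is updated with a proximal-gradient step whose partial gradient is evaluated at a stale iterate and perturbed by noise. The toy objective, the block splitting, the step size, the delay refresh schedule, and the noise scale are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative sketch of a SAPALM-style update on the toy problem
#   f(x) = 0.5 * ||A x - b||^2 + lam * ||x||_1,
# with x split into coordinate blocks. The delay and noise models below
# are simplifying assumptions, not the authors' implementation.

rng = np.random.default_rng(0)
n, d, m = 50, 20, 4                       # samples, dimension, number of blocks
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
lam = 0.1
blocks = np.array_split(np.arange(d), m)

def prox_l1(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def grad_block(x, j):
    """Partial gradient of the smooth part with respect to block j."""
    return A[:, blocks[j]].T @ (A @ x - b)

step = 1.0 / np.linalg.norm(A, 2) ** 2    # conservative step size
x = np.zeros(d)
stale_x = x.copy()                        # stands in for a delayed read

for k in range(2000):
    j = rng.integers(m)                              # random coordinate block
    g = grad_block(stale_x, j)                       # partial gradient at stale iterate
    g += 0.01 * rng.standard_normal(g.size)          # injected stochastic noise
    x[blocks[j]] = prox_l1(x[blocks[j]] - step * g, step * lam)
    if k % 5 == 0:                                   # refresh the "delayed" copy
        stale_x = x.copy()
```

In the actual asynchronous method, many workers perform this step concurrently on shared memory, so the staleness arises from other workers' uncoordinated writes rather than from an explicit refresh schedule as above.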
Reviews: The Sound of APALM Clapping: Faster Nonsmooth Nonconvex Optimization with Stochastic Asynchronous PALM
I find the approach rather interesting; in particular, the broad and general problem formulation makes it applicable to a wide range of problems. However, I was surprised by the absence of any reference to the seminal Robbins/Monro paper, and also to recent developments in stochastic-gradient-based sampling (see below). The authors perform local gradient updates of coordinate blocks by computing partial gradients and adding noise in each asynchronous step. I was wondering how this relates to the "usual" stochastic gradient descent update: since the locally computed partial gradient is based on delayed (noisy) variable states, a sequence of these noisy partial gradients should converge to the true partial gradient as well. Further, recent work on SGD-based sampling has shown that adding noise to variable states obtained by noisy gradient updates (as the authors also do) yields good samples from the distribution underlying the optimal variable setting, even in the nonconvex setting. That being said, the submitted work remains valid, but it would have been interesting to compare the proposed approach to well-established stochastic gradient methods.
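As a small numerical aside on the reviewer's point, the sketch below averages noisy partial-gradient samples taken at mildly perturbed (stale) states and compares the running average to the true gradient. Modeling staleness as a mean-zero perturbation is a simplification made only for this illustration; real asynchronous delays need not be unbiased.

```python
import numpy as np

# Illustrative only: averaging noisy gradient samples taken at perturbed
# ("stale") states recovers the true gradient, in the Robbins/Monro spirit.
# Treating staleness as mean-zero perturbation is a simplifying assumption.

rng = np.random.default_rng(1)
x = np.ones(10)
true_grad = 2.0 * x                       # gradient of f(x) = ||x||^2 at x

running = np.zeros(10)
for k in range(1, 10001):
    stale = x + 0.05 * rng.standard_normal(10)           # perturbed read of x
    noisy = 2.0 * stale + 0.5 * rng.standard_normal(10)  # noisy gradient sample
    running += (noisy - running) / k                     # running average

print(np.linalg.norm(running - true_grad))  # small: average approaches 2x
```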